Context Based Word Prediction for Texting Language
Abstract
The use of digital mobile phones has led to a tremendous increase in communication over SMS. On a phone keypad, multiple words map to the same numeric code. We propose a Context Based Word Prediction system for SMS messaging in which context is used to predict the most appropriate word for a given code. We extend this system to allow informal words (short forms of proper English words). Each informal word is mapped to its proper English words by phonetic similarity, using Double Metaphone Encoding. The results show a 31% improvement over traditional frequency-based word estimation.
Introduction
The growth of wireless technology has provided many new ways of communicating, such as SMS (Short Message Service). SMS messaging can also be used to interact with automated systems or to participate in contests. With the tremendous increase in mobile text messaging, there is a need for an efficient text input system. Because a mobile phone has a limited number of keys, multiple letters are mapped to the same number (8 keys, 2 to 9, for 26 letters). This many-to-one mapping of letters to numbers gives the same numeric code for multiple words.
Existing predictive text systems use frequency-based disambiguation and predict the most commonly used word ahead of the other possible words. T-9 (Text on 9-keys), developed by Tegic Communications, is one such predictive text technology, used by LG, Siemens, Nokia, Sony Ericsson and others in their phones. iTap is a similar system developed by Motorola and used in its phones.
The T-9 system predicts the word for a given numeric code based on frequency alone. This does not always give the correct result. For example, for code ‘63’, two possible words are ‘me’ and ‘of’. Given a frequency list in which ‘of’ is more likely than ‘me’, the T-9 system will always predict ‘of’ for code ‘63’. So, for a sentence like ‘Give me a box of chocolate’, the prediction would be ‘Give of a box of chocolate’.
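The frequency-based behaviour described above can be illustrated with a minimal Python sketch. It is not the T-9 implementation: the word counts are hypothetical values invented for this example, and the names `encode`, `predict_by_frequency` and `decode_sentence` are illustrative. Only the keypad layout (letters on keys 2 to 9) and the ‘63’ example come from the text.

```python
# Standard phone keypad layout: keys 2 to 9 carry the 26 letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def encode(word):
    """Return the numeric key sequence for a word, e.g. 'me' -> '63'."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


# Hypothetical unigram counts, invented for illustration only.
UNIGRAM_COUNTS = {
    "of": 900, "me": 300, "give": 120, "a": 1000, "box": 40, "chocolate": 10,
}

# Group the vocabulary by numeric code (many words can share one code).
WORDS_BY_CODE = {}
for _word in UNIGRAM_COUNTS:
    WORDS_BY_CODE.setdefault(encode(_word), []).append(_word)


def predict_by_frequency(code):
    """Pick the most frequent word for a code, ignoring all context."""
    candidates = WORDS_BY_CODE.get(code, [])
    if not candidates:
        return None
    return max(candidates, key=lambda w: UNIGRAM_COUNTS[w])


def decode_sentence(sentence):
    """Encode each word to its key code, then decode by frequency alone."""
    return " ".join(predict_by_frequency(encode(w)) for w in sentence.split())


print(encode("me"), encode("of"))                     # 63 63
print(decode_sentence("give me a box of chocolate"))  # give of a box of chocolate
```

Because ‘me’ and ‘of’ share the code ‘63’ and the lookup ignores neighbouring words, the more frequent ‘of’ wins in both positions.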
The sentence itself gives us information about what the correct word for a given code should be. Consider the same sentence with blanks: “Give _ a box _ chocolate”. In English usage, ‘of’ is more likely than ‘me’ to follow the noun ‘box’; that is, the phrase “box of” is more likely than “box me”. The proposed algorithm is an online method that uses this knowledge to predict the correct word for a given code from its preceding context.
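One way to make the idea of using the preceding context concrete is a bigram count model, sketched below in Python. This is an assumption for illustration: the text above does not specify the authors' model, and the toy corpus, the fallback to unigram counts, and the names `predict` and `decode` are all hypothetical.

```python
from collections import Counter

# Standard phone keypad layout: keys 2 to 9 carry the 26 letters.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}


def encode(word):
    """Return the numeric key sequence for a word, e.g. 'of' -> '63'."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


# Hypothetical toy corpus, invented for illustration only.
CORPUS = [
    "give me a box of chocolate",
    "a box of chocolate",
    "a cup of tea",
    "a box of tea",
    "give me a cup",
]

unigrams = Counter()
bigrams = Counter()
for line in CORPUS:
    words = line.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))  # (previous word, word) pairs

# Group the vocabulary by numeric code.
candidates_by_code = {}
for word in unigrams:
    candidates_by_code.setdefault(encode(word), []).append(word)


def predict(code, previous_word=None):
    """Prefer the candidate seen most often after previous_word;
    fall back to plain frequency when the bigram counts tie."""
    candidates = candidates_by_code.get(code, [])
    if not candidates:
        return None
    return max(candidates,
               key=lambda w: (bigrams[(previous_word, w)], unigrams[w]))


def decode(codes):
    """Decode codes left to right, feeding each prediction forward as context."""
    words, previous = [], None
    for code in codes:
        word = predict(code, previous)
        words.append(word if word is not None else code)
        previous = word
    return " ".join(words)


codes = [encode(w) for w in "give me a box of chocolate".split()]
print(decode(codes))  # give me a box of chocolate
```

With no context, `predict("63")` still returns ‘of’ (the more frequent word in the toy corpus), but after ‘give’ the bigram count selects ‘me’, and after ‘box’ it selects ‘of’.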
Similar resources
Introspective Study of Emotion Icon in Public Chat as a Gesture of Texting
An emotion icon, better known as emoticon is a metacommunicative pictorial representation of a facial expression that, in the absence of body language and prosody, serves to draw a receiver's attention to the tenor or temper of a sender's nominal verbal communication, changing and improving its interpretation. The present study investigates the use of these nonverbal cues in whatsapp public cha...
Context-Based Word Prediction and Classification
This paper presents a new approach for word prediction problem. Word prediction is a natural language processing problem that tries to predict the correct word in a given context. Word completion utilities, writing aids, and language translation are among the most common applications of word prediction. In this paper, we describe a new method to predict the correct word given its context. A dat...
First Language Activation during Second Language Lexical Processing in a Sentential Context
Lexicalization-patterns, the way words are mapped onto concepts, differ from one language to another. This study investigated the influence of first language (L1) lexicalization patterns on the processing of second language (L2) words in sentential contexts by both less proficient and more proficient Persian learners of English. The focus was on cases where two different senses of a polys...
Spelling-based Phonics Instruction: It’s Effect on English Reading and Spelling in an EFL Context
Systematic phonics instruction in first language education has recently received considerable research attention due to its critical role in facilitating phonological awareness and processing skills. However, little is known about the effects of systematic phonics instruction on foreign language reading and spelling in an EFL context. This study examined the effects of spelling-based phonics in...
Written word recognition by the elementary and advanced level Persian-English bilinguals
According to a basic prediction made by the Revised Hierarchical Model (RHM), at early stages of language acquisition, strong L2-L1 lexical links are formed. RHM predicts that these links weaken with increasing proficiency, although they do not disappear even at higher levels of language development. To test this prediction, two groups of highly proficie...